1.
Applied Sciences ; 13(4):2062, 2023.
Article in English | ProQuest Central | ID: covidwho-2257015

ABSTRACT

Social media platforms have become a medium for people across the globe to voice their opinions and ideas. Because these platforms preserve anonymity and freedom of expression, users can humiliate individuals and groups while disregarding social etiquette, which has multiplied and diversified incidents of cyberbullying and cyber hate speech. This problem has recently drawn the attention of researchers worldwide, yet current practices for sifting online content and offsetting the spread of hatred do not go far enough. Contributing factors are the recent prevalence of regional languages on social media and the dearth of language resources and flexible detection approaches, specifically for low-resource languages. Most existing studies are oriented towards traditional resource-rich languages, leaving a large gap for recently embraced resource-poor languages. One such language, currently adopted worldwide and most typically by South Asian users for textual communication on social networks, is Roman Urdu. It is derived from Urdu and written left to right in Roman script. The language poses numerous computational challenges for natural language preprocessing tasks because of its inflections, derivations, lexical variations, and morphological richness. To alleviate this problem, this research proposes a cyberbullying detection approach for analyzing textual data in Roman Urdu based on advanced preprocessing methods, voting-based ensemble techniques, and machine learning algorithms. The study extracted a large number of features, including statistical features, word n-grams, combined n-grams, and a bag-of-words (BOW) model with TF-IDF weighting, in different experimental settings using GridSearchCV and cross-validation techniques.
The detection approach is designed to handle users' textual input by considering the user-specific, colloquial, and non-standard writing styles found on social media. The experimental results show that SVM with embedded hybrid n-gram features produced the highest average accuracy, around 83%. Among the ensemble voting-based techniques, XGBoost achieved the best accuracy, 79%. Both implicit and explicit Roman Urdu instances were evaluated, and severity was categorized based on prediction probabilities. Time complexity was also analyzed in terms of execution time, indicating that LR, across different parameters and feature combinations, is the fastest algorithm. The results are promising with respect to standard assessment metrics and indicate the feasibility of the proposed approach for cyberbullying detection in Roman Urdu.
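
The combined n-gram features with TF-IDF weighting mentioned in this abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' pipeline: the function names `word_ngrams` and `tfidf`, the whitespace tokenizer, the unigram-plus-bigram range, and the weighting formula tf * (ln(N / df) + 1) are all assumptions chosen for illustration, and the sample strings are placeholders rather than data from the study.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return word n-grams of length n_min..n_max (hypothetical helper).

    Tokenization is a plain lowercase whitespace split, which is an
    assumption; the abstract does not describe the tokenizer used.
    """
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, n_min=1, n_max=2):
    """Weight each n-gram by tf * (ln(N / df) + 1), one dict per document."""
    counts = [Counter(word_ngrams(d, n_min, n_max)) for d in docs]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for c in counts:
        df.update(c.keys())
    n_docs = len(docs)
    weighted = []
    for c in counts:
        total = sum(c.values())
        weighted.append({
            gram: (k / total) * (math.log(n_docs / df[gram]) + 1.0)
            for gram, k in c.items()
        })
    return weighted


# Placeholder strings, not taken from the paper's dataset.
docs = ["tum bohat acha ho", "tum bohat bura ho"]
weights = tfidf(docs)
```

In this sketch an n-gram that occurs in only one document (such as "acha") receives a higher weight than one shared by both documents (such as "tum"), which is the property that makes TF-IDF useful as classifier input.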

2.
22nd Annual International Conference on Computational Science, ICCS 2022 ; 13350 LNCS:584-598, 2022.
Article in English | Scopus | ID: covidwho-1958882

ABSTRACT

Cyberbullying is aggressive and intentional behavior committed by groups or individuals, and its main manifestation is offensive or hurtful comments on social media. Existing research on cyberbullying detection underuses natural language processing technology and is limited to extracting features of the comment content. Meanwhile, the existing datasets for cyberbullying detection are non-standard and unbalanced, and their content is relatively outdated. In this paper, we propose a novel Hybrid deep Model based on Multi-feature Fusion (HMMF), which models the content of news comments and the side information related to net users and comments simultaneously, to improve the performance of cyberbullying detection. In addition, we present JRTT, a new, publicly available benchmark dataset for cyberbullying detection. All the data are collected from social media platforms and consist of Chinese comments on COVID-19 news. To evaluate the effectiveness of HMMF, we conduct extensive experiments on the JRTT dataset with five existing pre-trained language models. Experimental results and analyses show that HMMF achieves state-of-the-art performance on cyberbullying detection. To facilitate research in this direction, we release the dataset and the project code at https://github.com/xingjian215/HMMF. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

3.
2021 IEEE International Conference on Big Data, Big Data 2021 ; : 2442-2453, 2021.
Article in English | Scopus | ID: covidwho-1730869

ABSTRACT

People can easily post aggressive remarks on social media platforms using the anonymity they provide. During the COVID-19 pandemic, the usage of social media has increased severalfold according to surveys, and people are more vulnerable to cyber attacks now than ever. Prevention of cyberbullying needs careful monitoring and identification. Most existing work on cyberbullying detection has employed traditional machine learning classifiers with handcrafted features, and deep learning-based models have entered this domain only very recently. Categorizing cyberbullying based on traits is a complex task that needs contextual consideration. In this work, we propose a new approach to detect cyberbullying on social media platforms using a neural ensemble method of transformer-based architectures with an attention mechanism. Our proposed architecture is trained on one balanced and one imbalanced dataset and outperforms the given ML and DNN baselines by a significant margin in both cases. We achieved an average F1-score of 95.59% for five classes and 90.65% for six classes on the Fine-Grained Cyberbullying Dataset (FGCD), and 87.28% on the Twitter parsed dataset. Our in-depth results provide insight into the effectiveness of transformer-based models in cyberbullying detection and pave the way for future research to combat this serious online issue. We have released our models and code. © 2021 IEEE.
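
The abstract does not say how the ensemble combines the outputs of its member models. One common scheme is soft voting, in which per-class probabilities are averaged and the highest average wins. The sketch below assumes that scheme purely for illustration; the function name `soft_vote` and the probability values are hypothetical and are not taken from the paper.

```python
def soft_vote(prob_vectors):
    """Average per-class probabilities across models (soft voting).

    prob_vectors: one probability list per model, all of equal length.
    Returns (index of the winning class, list of averaged probabilities).
    This combination rule is an assumption, not the paper's stated method.
    """
    if not prob_vectors:
        raise ValueError("at least one model output is required")
    n_classes = len(prob_vectors[0])
    if any(len(p) != n_classes for p in prob_vectors):
        raise ValueError("all models must score the same number of classes")
    n_models = len(prob_vectors)
    avg = [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg


# Hypothetical outputs of three models for one comment over two classes.
label, averaged = soft_vote([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
```

Here the first model favors class 0, but the averaged probabilities favor class 1, so the ensemble predicts class 1.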
